REMARKS 

Claims 1-37 and 39-42 were presented for examination and were pending in this 
application. In an Official Action dated July 5, 2005, claims 1-37 and 39-42 were rejected. 
Applicants thank Examiner for examination of the claims pending in this application and 
address Examiner's comments below. 

Applicants herein cancel claim 13 and amend claims 1, 7-10, 16, 18, 20, 26, 28, 29, 
31-33, 35, 36, and 39-42. Reconsideration of the application in view of the above 
amendments and the following remarks is respectfully requested. 

Claim Amendments 

In addition to amendments made to independent claims 1, 7-10, 26, 35, and 40, which 
are discussed in more detail below in the sections regarding rejections under 35 U.S.C. § 102 
and 35 U.S.C. § 103, Applicants have amended the following claims to clarify grammar and 
correct antecedent basis issues noticed during prosecution. 

Claims 10, 26, and 36 have been amended to clarify antecedent basis associated with 
the phrase "an image." Claims 16, 18, 20, 28, 29, 41, and 42 have been amended to clarify 
antecedent basis associated with the phrase "annotation object." The claim dependencies of 
claims 31-33 have been amended to correct lack of antecedent basis arising from the original 
claim misnumbering. Claims 35 and 39 have been amended to clarify the selection of the 
"visual notation" of the "image." Claims 41-42 were additionally amended to correct 
grammatical errors. 

Applicants respectfully submit that no new matter is introduced as a result of these 
amendments. 

Claim Objections 

The Examiner objected to the numbering of claims, stating in particular, 
"[m]isnumbered claim 37 should been renumbered claim 38." 

Applicants acknowledge Examiner's objection and confirm that claim 38 is canceled. 

Claim Rejection Under 35 U.S.C. § 102(b) 

In the 2nd paragraph of the Office Action, the Examiner rejected claims 35-37 and 39 
under 35 U.S.C. § 102(b) as allegedly being anticipated by Lin, "An Ink and Voice 
Annotation System for DENIM," ("Lin"). This rejection is now traversed. 

Independent Claim 35 

Claim 35 has been amended to recite in part, "generating the annotation 
automatically, in response to user input of a location within the image and an audio input, 
including automatically terminating a recording of the audio input based on a predetermined 
audio level." Applicants respectfully submit that claim 35 as amended recites a method, 
based on the automatic creation of an annotation, with an ease of use that is not reflected in 
Lin. As indicated in paragraph 6 of the specification, "[o]nce an image is displayed, the user 
need only select an image and speak to create an annotation. The system automatically 
creates the annotation. . ." In particular, as described in paragraph 44 of the specification: 

by indicating a position on a display device 100 through clicking, pointing, or 
touching the display screen, creation of an annotation is completed and audio 
recording is initiated. Audio recording may cease when the audio level drops 
below a predetermined threshold ... 

The ease of use of the claimed method relates, for example, to both initiating and completing 
the audio recording. As indicated in paragraph 46 of the specification: 

In one embodiment, control unit 150 initiates audio recording in response to 
detecting a positional stimulus, whereas in an alternate embodiment, control 
unit 150 automatically initiates audio recording upon detecting audio input 
above a predetermined threshold level. Similarly, audio recording may 
automatically be terminated upon the audio level dropping below a 
predetermined threshold or upon control unit 150 detecting a predetermined 
duration of silence where there is no audio input. 

These passages describe annotations created "automatically", for example, by input of a 
positional stimulus (e.g., identifying a location within an image) or audio signal, and 
terminated automatically, for example, by a reduced signal level or silence. As summarized 
in paragraph 66 of the specification, "The annotation process of the present invention is 
particularly advantageous because it provides for easy annotation of images with a 'point and 
talk' methodology." 
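
By way of illustration only, and without limiting the claims, the automatic initiation and termination described in paragraphs 44 and 46 can be sketched as follows. The function name `auto_record_span`, its parameters, and the representation of the audio input as a list of per-frame levels are assumptions made for this sketch; they do not appear in the specification and are not presented as Applicants' implementation.

```python
from typing import List, Optional, Tuple


def auto_record_span(
    levels: List[float],
    threshold: float,
    silence_frames: int,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the recording, end exclusive.

    Hypothetical sketch: recording starts at the first frame whose level
    exceeds `threshold` and stops once `silence_frames` consecutive frames
    fall below it. Returns None if recording never starts.
    """
    start: Optional[int] = None
    quiet_run = 0  # consecutive below-threshold frames seen while recording

    for i, level in enumerate(levels):
        if start is None:
            # Automatic initiation: audio input above the threshold level.
            if level > threshold:
                start = i
        elif level < threshold:
            quiet_run += 1
            if quiet_run >= silence_frames:
                # Automatic termination: end at the first frame of the quiet run.
                return (start, i - quiet_run + 1)
        else:
            quiet_run = 0

    if start is None:
        return None
    # Input ended before the silence criterion was met; keep everything recorded.
    return (start, len(levels))


if __name__ == "__main__":
    frames = [0.0, 0.1, 0.6, 0.8, 0.7, 0.1, 0.0, 0.0, 0.9]
    print(auto_record_span(frames, threshold=0.5, silence_frames=3))  # (2, 5)
```

In this sketch no user action is needed to stop the recording; the span closes once the level stays below the threshold for the chosen number of frames.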

This "point and talk methodology" is not disclosed or suggested in Lin. Lin requires 
four separate steps to create an annotation: 1) selection of a microphone tool, 
2) identification of a position for the annotation, 3) starting of recording, and 4) manual 
termination of the audio recording. In particular, as described on page 3, lines 13-19, of Lin, 
"the [user] first taps on the microphone to pick it up." Then, the user "taps [a part of the 
design he wants to comment on] with the microphone." Then, in the "sound recorder" 
window, the user must click a "record" button to initiate recording (see also Figure 4b). 
Finally, to terminate the annotation generation, "the user closes the window, [and] a speaker 
icon appears where the user tapped." Thus, Lin does not provide for displaying objects with 
annotations including "generating the annotation automatically, in response to user input of a 
location within the image and an audio input, including automatically terminating a recording 
of the audio input based on a predetermined audio level," as required by Applicants' amended 
Claim 35. 

Based on the above Amendment and Remarks, Applicants respectfully submit that for 
at least these reasons independent claim 35 is patentably distinguishable over Lin. 
Therefore, Applicants respectfully request that Examiner reconsider the rejection, and 
withdraw it. 

Dependent Claims 36, 37, and 39 

As claims 36-37 and 39 are dependent on claim 35, all arguments advanced above 
with respect to claim 35 are hereby incorporated so as to apply to claims 36-37 and 39. 
These claims also have additional patentable features, in particular multi-modal annotations, 
not disclosed in Lin. 

Specifically, regarding claim 36, Lin does not enable text associated with a visual 
notation. The text identified by the Examiner (Fig. 3 of Lin) in reference to claim 36 is not 
associated with the speaker icon shown in Fig. 4c of Lin. Thus, Lin does not "determin[e] 
whether the annotation includes text," as required by Applicants' claim 35, from which claim 
36 depends. 

Specifically, regarding claim 39, Lin does not "determin[e] whether the annotation 
includes an audio signal," as required by Applicants' claim 39, since audio is the only item 
associated with the speaker icon of Lin. In contrast, Applicants' disclosure enables multi- 
modal annotations, which may be, for example, text, audio, or other media types. Lin does 
not teach such multi-modal annotations. 

Thus, Applicants respectfully submit that for at least these additional reasons claims 
36 and 39 are patentably distinguishable over Lin. 



Case 06364 (Amendment B) 
U.S. Serial No. 10/043,575 






In sum, based on the above Amendments and Remarks, Applicants respectfully 
submit that for at least these reasons claims 35-37 and 39 are patentably distinguishable over 
the cited reference. Therefore, Applicants respectfully request that Examiner reconsider the 
rejection, and withdraw it. 

Claim Rejection Under 35 U.S.C. § 103(a) 

In the 3rd and 4th paragraphs of the Office Action, the Examiner rejected claims 1-34 
and 40, and 41-42, respectively, under 35 U.S.C. § 103(a) as allegedly being unpatentable 
over U.S. Patent No. 6,499,016 to Anderson ("Anderson") in view of Balabanovic, 
"Multimedia Chronicles for Business Communication," ("Balabanovic"). These rejections are 
respectfully traversed. 

Independent Claims 1, 7-10, 26, and 40 

Applicants respectfully submit that independent claims 1, 7-10, 26, and 40 recite 
apparatus and methods based on the automatic creation of a complete annotation associated 
with a particular location within the image. The automatic creation is characterized by an 
ease of use, including automatic termination of the recording of an audio input, which is not 
reflected in Anderson or Balabanovic, either alone or in combination. 

Each independent claim 1, 7-10, 26, and 40 embodies these features. Amended 
claim 1 recites an apparatus for direct annotation of objects including a direct annotation 
creation module, with: 

the direct annotation creation module, in response to receiving the input audio 
signal and the reference to the location within the image, automatically 
creating an annotation object . . . and the direct annotation creation module 
automatically terminating a recording of the input audio signal based on a 
predetermined audio level. 

Amended claims 7-9 recite additional embodiments of an apparatus for direct annotation of 
objects in which: 

the direct annotation creation module, in response to receiving the input audio 
signal or the reference to the location within the image, automatically creating 
an annotation object . . . and the direct annotation creation module 
automatically terminating a recording of the input audio signal based on a 
predetermined audio level. 

Amended claims 10 and 26, directed to computer-implemented methods for direct annotation 
of objects, recite: 

creating an annotation object, . . ., the creating step occurring automatically in 
response to the receiving or detecting steps and including automatically 
terminating a recording of the audio input based on a predetermined audio 
level. 

Amended claim 40, directed to a computer-implemented method for retrieving images, 
recites: 

determining annotation objects that reference a close match to the audio input, 
each annotation object generated automatically in response to user input of a 
location within an image and an audio signal, where a recording of the audio 
signal is terminated automatically based on a predetermined audio level. 

As discussed above with reference to the rejection under 35 U.S.C. § 102(b), the discussion of which 
is herein incorporated, the specification enables, and the independent claims 1, 7-10, 26, and 
40 recite, complete annotations created automatically, for example, by mere input of a 
positional stimulus (e.g., identifying a location within an image) or audio signal, and 
terminated automatically, for example, by a reduced signal level or silence (see the 
discussion above regarding paragraphs 6, 44, and 46 of the specification). As pointed out 
previously, the invention is characterized by an ease of use making "[t]he annotation process 
of the present invention [] particularly advantageous because it provides for easy annotation 
of images with a 'point and talk' methodology" (paragraph 66 of the specification). 

Just as Lin does not reflect the ease of use embodied by Applicants' claims, neither 
does Anderson or Balabanovic, either alone or in combination. Anderson requires either an 
image capture or manual initiation by the user to enter the annotation mode (col. 4, lines 27-30). 
In particular, during sequential recording, the user must both manually initiate and 
manually terminate a particular voice recording once in the annotation mode. "For example, 
the user may press the audio button 22 and speak 'Caption' followed by a description, and 
then press the audio button 22 again to end the recording session." (col. 4, lines 53-56) This 
process must be repeated for each annotation (col. 4, lines 48-60). Balabanovic, although 
characterized as a "point and talk methodology," requires an additional click to stop an audio 
signal. "Clicking anywhere causes recording to begin . . . Clicking elsewhere on the screen 
stops the first bar growing [the recording] . . ." (col. 7, lines 54, 61-62). 

Neither Anderson nor Balabanovic, either alone or in combination, discloses 
automatic creation of a complete annotation. In particular, neither discloses or suggests 
annotations created automatically, for example, by mere input of an audio signal, or 
annotations terminated automatically, for example, by a reduced signal level or silence (see 
the discussion under 35 U.S.C. § 102(b) regarding paragraph 46 of the specification). These 
features are embodied in Applicants' independent claims 1, 7-10, 26, and 40, as quoted 
above. Thus, Applicants respectfully submit that independent claims 1, 7-10, 26, and 40 are 
patentably distinguishable over Anderson and Balabanovic, both alone and in combination. 

Another particular feature of the present invention is annotation of particular 
locations of an image. As described in paragraph 46 of the specification, "After the input is 
complete and/or a text equivalent found, a symbol or text representing the annotation is 
added to the image at the position of the image of the positional stimulus." As shown in 
Figures 11C and 11D, which are mentioned, for example, in paragraphs 67 and 46, 
respectively, an image may have several annotations associated with different locations of 
the image. 

Each independent claim 1, 7-10, 26, 35, and 40 embodies the feature of associating an 
annotation with a particular location within the image. Claim 1 recites "the direct annotation 
creation module, in response to receiving the input audio signal and the reference to the 
location within the image, automatically creating an annotation object, independent from the 
image, that associates the input audio signal with the location." Claims 7-9 each include a 
similar limitation as claim 1. Claims 10 and 26 variously recite "detecting selection of a 
location within the image" and "creating an annotation object, independent of the selected 
image, between the selected location and the audio input [or text]." Claim 35 recites 
"generating the annotation automatically, in response to user input of a location within the 
image and an audio input." Claim 40 includes a limitation similar to claim 35. 

Based on the above Amendment and Remarks, Applicants respectfully submit that for 
at least these reasons independent claims 1, 7-10, 26, and 40 are patentably distinguishable 
over Anderson and Balabanovic, both alone and in combination. Therefore, Applicants 
respectfully request that Examiner reconsider the rejections, and withdraw them. 

Dependent Claims 2-6, 11-12, 14-25, 27-34, and 41-42 

Claims 2-6 are dependent on claim 1, claims 11-12 and 14-25 are dependent on claim 
10, claims 27-34 are dependent on claim 26, and claims 41-42 are dependent on claim 40. 
Thus, all arguments advanced above with respect to independent claims 1, 10, 26, and 40 are 
hereby incorporated so as to apply to claims 2-6, 11-12 and 14-25, 27-34, and 41-42, 
respectively. 

Applicants also respectfully submit that claims 4, 5, 21-23, 24-25, 27-34, and 41-42 
have additional patentable features not disclosed or suggested in Anderson and 
Balabanovic, either alone or in combination. 

Specifically regarding claim 4, the cited references do not disclose or suggest "an 
audio vocabulary storage for storing a plurality of audio signals and corresponding text 
strings," as recited in claim 4. In particular, Anderson translates the input audio signals via 
voice recognition (col. 5, lines 31-35), which may require later user correction (col. 5, lines 
47-51), rather than "finding a corresponding text string that matches the audio input," as 
recited by Applicants' claim 4. Additionally, Balabanovic does not allow text annotations 
(col. 6, lines 30-35). This argument also applies to Applicants' claims 21-23, which similarly 
recite "a vocabulary," and to Applicants' claims 27-34, which include the "vocabulary" 
recited in claim 26, from which claims 27-34 depend. 

Specifically regarding claim 5, the cited references do not disclose or suggest "a 
dynamic vocabulary updating module ... to create a new entry in the audio vocabulary 
storage," as recited by Applicants' claim 5. This feature facilitates customization of the 
annotation process, which is not possible in the cited references. This argument also applies 
to Applicants' claims 24-25 and 33-34, regarding "updating the vocabulary." 

Specifically regarding claims 41 and 42, the cited references do not disclose or 
suggest "comparing the audio input to an audio signal reference of the annotation object," as 
recited by claims 41 and 42. This feature facilitates audio searching of audio annotations. 
Anderson, which is the only cited reference that describes searching the annotations, 
performs text searching using a text input keyword to search amongst text strings that were 
translated from audio signals (col. 6, lines 32-42). 

Thus, Applicants respectfully submit that for at least these additional reasons claims 
4, 5, 21-23, 24-25, 27-34, and 41-42 are patentably distinguishable over the cited references, 
either alone or in combination. 

In sum, based on the above Amendment and Remarks, Applicants respectfully submit 
that for at least these reasons claims 1-12, 14-34, and 40-42 are patentably distinguishable 
over Anderson and Balabanovic, both alone and in combination. Therefore, Applicants 
respectfully request that Examiner reconsider the rejections, and withdraw them. 

CONCLUSION 

In sum, Applicants respectfully submit that claims 1-12, 14-37, and 39-42, as 
presented herein, are patentably distinguishable over all of the art of record. Therefore, 
Applicants request reconsideration of the basis for the rejections to these claims and request 
allowance of the claims. 

In addition, Applicants respectfully invite Examiner to contact Applicants' 
representative at the number provided below if Examiner believes it will help expedite 
furtherance of this application. 

Respectfully Submitted, 

GREGORY J. WOLFF AND PETER E. HART 



Date: [illegible] By: 




Greg T. Sueoka, Reg. No.: 33,800 
Fenwick & West LLP 
Silicon Valley Center 
801 California Street 
Mountain View, CA 94041 
Tel.: (650) 335-7194 
Fax: (650) 938-5200 